13 research outputs found

    A Hybrid Approach to the Sentiment Analysis Problem at the Sentence Level

    This doctoral thesis deals with a number of challenges related to investigating and devising solutions to the Sentiment Analysis problem, a subfield of Natural Language Processing (NLP), following a path that differs from the most common approaches currently in use. The majority of research and application development in Sentiment Analysis (SA) / Opinion Mining (OM) has been conducted using Supervised Machine Learning techniques. It is our intention to prove that a hybrid approach merging fuzzy sets, a solid sentiment lexicon, traditional NLP techniques and aggregation methods will compound the strengths of all these tools. In this thesis we will prove three main points: 1. That a Hybrid Classification Model based on the techniques mentioned above is capable of: (a) performing as well as or better than established Supervised Machine Learning techniques, namely Naïve Bayes (NB) and Maximum Entropy (ME), when each of the latter is used as the only classification method for calculating subjectivity polarity, and (b) computing the intensity of the polarity previously estimated. 2. That cross-ratio uninorms can be used to effectively fuse the classification outputs of several algorithms, producing a compensatory effect. 3. That the Induced Ordered Weighted Averaging (IOWA) operator is a very good choice for modelling the opinion of the majority (consensus) when the outputs of a number of classification methods are combined. For academic and experimental purposes we have built the proposed methods and associated prototypes iteratively: Step 1: we start with the so-called Hybrid Standard Classification (HSC) method, responsible for determining subjectivity polarity. Step 2: we continue with the Hybrid Advanced Classification (HAC) method, which computes the polarity intensity of opinions/sentiments. Step 3: in closing, we present two methods that produce a semantic-specific aggregation of two or more classification methods, as a complement to the HSC/HAC methods when the latter cannot generate a classification value or when we are looking for an aggregation that implies consensus, respectively: *the Hybrid Advanced Classification with Aggregation by Cross-ratio Uninorm (HACACU) method

    A Consensus Approach to the Sentiment Analysis Problem Driven by Support-Based IOWA Majority

    In group decision making, there are many situations where the opinion of the majority of participants is critical. The scenarios are varied: a number of doctors seeking agreement on the diagnosis of an illness, for example, or members of parliament looking for consensus on a specific law being passed. In this article, we present a method that uses induced ordered weighted averaging (IOWA) operators to aggregate a majority opinion from a number of sentiment analysis (SA) classification systems, where the latter take the role usually played by human decision-makers in group decision situations. The numerical outputs of the different SA classification methods are used as input to a specific IOWA operator that is semantically close to the fuzzy linguistic quantifier ‘most of’. The object of the aggregation is the intensity of the previously determined sentence polarity, so that the result represents what the majority thinks. During the experimental phase, the IOWA operator coupled with the linguistic quantifier ‘most’ yielded superior results compared to other techniques commonly applied when some form of averaging is needed, such as the arithmetic mean or the median
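
The majority aggregation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each classifier's support is the number of other classifiers whose output lies within a tolerance `alpha` (a hypothetical parameter), and that ‘most of’ is modelled by the quantifier Q(r) = r^0.5 (an assumed choice; the abstract does not state the formula). The function names are illustrative.

```python
def support(values, alpha=0.3):
    """For each value, count how many OTHER values lie within alpha of it."""
    return [
        sum(1 for j, v in enumerate(values) if j != i and abs(v - x) <= alpha)
        for i, x in enumerate(values)
    ]


def quantifier_weights(n, exponent=0.5):
    """OWA weights w_i = Q(i/n) - Q((i-1)/n) for Q(r) = r**exponent (assumed quantifier)."""
    def q(r):
        return r ** exponent
    return [q(i / n) - q((i - 1) / n) for i in range(1, n + 1)]


def iowa_majority(values, alpha=0.3):
    """IOWA aggregation: reorder values by decreasing support, then take a weighted sum."""
    sup = support(values, alpha)
    # sorted() is stable, so values with equal support keep their input order.
    ordered = [v for _, v in sorted(zip(sup, values), key=lambda pair: -pair[0])]
    weights = quantifier_weights(len(values))
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical polarity intensities from three classifiers; two of them agree.
scores = [0.8, 0.75, 0.2]
majority = iowa_majority(scores)
mean = sum(scores) / len(scores)
```

With these inputs the two mutually supporting scores receive the largest weights, so `majority` lies closer to them than the arithmetic `mean` does, which is the intended ‘majority’ semantics.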

    Successes and challenges in developing a hybrid approach to sentiment analysis

    This article covers some successes and lessons learned during the development of a hybrid approach to Sentiment Analysis (SA) based on a Sentiment Lexicon, Semantic Rules, Negation Handling, Ambiguity Management and Linguistic Variables. The proposed hybrid method is presented and applied to two selected datasets: the Movie Review and Sentiment Twitter datasets. The results are compared against those obtained when the Naïve Bayes (NB) and Maximum Entropy (ME) supervised machine learning classification methods are applied to the same datasets. The proposed hybrid system attained higher accuracy and precision scores than NB and ME, which shows its superiority when applied to the SA problem at the sentence level. Finally, an alternative strategy is explored that calculates the orientation polarity and the polarity intensity in one step instead of the two steps used in the hybrid approach. Analysis of the mixed results obtained with this alternative approach shows its potential as an aid in the computation of semantic orientations, and yielded some lessons for developing a more effective mechanism for calculating the orientation polarity and polarity intensity

    IOWA & Cross-ratio Uninorm operators as aggregation tools in sentiment analysis and ensemble methods

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. In the field of Sentiment Analysis, a number of different classifiers are used to establish the polarity of a given sentence. As such, there may be a need to aggregate the outputs of the algorithms involved in the classification effort. If the output of every classification algorithm is regarded as the opinion of an expert in the subject at hand, we are in the presence of a group decision making problem, which in turn translates into two sub-problems: (a) defining the desired semantic of the aggregation of all opinions, and (b) applying an aggregation technique that achieves the semantic chosen in (a). The objective of this article is twofold. Firstly, we present two specific aggregation semantics, namely fuzzy-majority and compensatory, which are based on Induced Ordered Weighted Averaging and Uninorm operators, respectively. Secondly, we show the power of these two techniques by applying them to an existing hybrid method for classification of sentiments at the sentence level. In this case, the proposed aggregation solutions act as a complement that improves the performance of the aforementioned hybrid method. In more general terms, the proposed solutions could be used to create semantic-sensitive ensemble methods, instead of the simpler ensemble choices available today in commercial machine learning software offerings

    Main Concepts, State of the Art and Future Research Questions in Sentiment Analysis.

    This article has multiple objectives. First of all, the fundamental concepts and challenges of the research field known as Sentiment Analysis (SA) are presented. Secondly, a summarised chronological account of the research performed in SA is provided, together with some bibliometric indicators that shed light on the techniques most frequently used to address the central aspects of SA. The geographical locations where the research took place are also given. In closing, it is argued that there is no hard evidence that fuzzy sets or hybrid approaches encompassing unsupervised learning, fuzzy sets and a solid psychological background of emotions could not be at least as effective as supervised learning techniques

    Cross-ratio uninorms as an effective aggregation mechanism in sentiment analysis

    There are situations in which lexicon-based methods for Sentiment Analysis (SA) are not able to generate a classification output for specific instances of a given dataset. Most often, the reason is the absence from the sentiment lexicon of specific terms required in the classification effort. In such cases, there were only two possible paths to follow: (1) add terms to the lexicon (off-line process) by human intervention to guarantee that no noise is introduced into the lexicon, which prevents the classification system from providing an immediate answer; or (2) use the services of a word-frequency dictionary (on-line process), which is computationally costly to build. This paper investigates an alternative approach to compensate for the inability of a lexicon-based method to produce a classification output. The method is based on combining the classification outputs of non-lexicon-based tools. First, the outcome values of applying two or more non-lexicon classification methods are obtained. Second, these non-lexicon outcomes are fused using a uninorm-based approach, which has been proved to have the compensation properties required in the SA context, to generate the classification output that the lexicon-based approach is unable to produce. Experimental results based on the execution of two well-known supervised machine learning algorithms, namely Naïve Bayes and Maximum Entropy, and the application of a cross-ratio uninorm operator are presented. Performance indices associated with options (1) and (2) above are compared against the results obtained using the proposed approach for two different datasets. Additionally, the proposed cross-ratio uninorm approach is compared against the same approach with the arithmetic mean used as the aggregation operator instead. It is shown that combining non-lexicon-based classification methods with specific uninorm operators improves the classification performance of lexicon-based methods, and provides an alternative solution to the SA classification problem when needed. The proposed aggregation method could also be used as a replacement for the ensemble averaging techniques commonly applied when combining the outputs of several machine learning classifiers
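
The compensatory behaviour mentioned in this abstract can be seen in a small Python sketch of the cross-ratio uninorm with identity element 0.5, U(x, y) = xy / (xy + (1 - x)(1 - y)). This is an illustration only, not the authors' code. It assumes classifier outputs are scaled to [0, 1] with 0.5 as the neutral point, and it takes the conjunctive convention U(0, 1) = U(1, 0) = 0. The function names are hypothetical.

```python
from functools import reduce


def cross_ratio_uninorm(x, y):
    """Cross-ratio uninorm on [0, 1] with identity element 0.5."""
    numerator = x * y
    denominator = x * y + (1.0 - x) * (1.0 - y)
    if denominator == 0.0:
        # Only reached for (0, 1) or (1, 0); conjunctive convention assumed.
        return 0.0
    return numerator / denominator


def fuse(scores):
    """Fuse two or more classifier scores; the operator is associative."""
    return reduce(cross_ratio_uninorm, scores)


# Hypothetical outputs from two non-lexicon classifiers (e.g. NB and ME).
both_positive = fuse([0.7, 0.8])   # reinforced upwards, above both inputs
both_negative = fuse([0.3, 0.2])   # reinforced downwards, below both inputs
conflicting = fuse([0.7, 0.3])     # symmetric disagreement cancels out
```

Two scores on the same side of 0.5 reinforce each other, while scores on opposite sides compensate; an arithmetic mean would always stay between the two inputs.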

    A Hybrid Approach to Sentiment Analysis

    This contribution presents a hybrid approach to Sentiment Analysis (SA) encompassing the use of semantic rules, fuzzy sets, unsupervised machine learning techniques and a sentiment lexicon improved with the support of SentiWordNet. A Hybrid Standard Classification is first carried out, which is then enhanced into a Hybrid Advanced approach incorporating a linguistic classification of semantic polarity modelled using fuzzy sets. The mechanism of the new SA methodology is illustrated by applying it to compute the polarity of a given sentence and to a publicly available benchmark dataset: the Movie Review Dataset
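
As a rough illustration of how a polarity intensity could be given a linguistic classification with fuzzy sets, the Python sketch below maps a score in [0, 1] to the label with the highest trapezoidal membership. The label names and all breakpoints are hypothetical placeholders chosen for the example; the abstract does not specify them.

```python
def trapezoid(x, a, b, c, d):
    """Membership of x in a trapezoidal fuzzy set with support [a, d] and core [b, c]."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge


# Hypothetical linguistic labels and breakpoints (not taken from the source).
INTENSITY_SETS = {
    "poor": (0.0, 0.0, 0.05, 0.15),
    "slight": (0.05, 0.15, 0.25, 0.35),
    "moderate": (0.25, 0.35, 0.65, 0.75),
    "very": (0.65, 0.75, 0.85, 0.95),
    "most": (0.85, 0.95, 1.0, 1.0),
}


def intensity_label(score):
    """Return the label whose fuzzy set gives the score the highest membership."""
    return max(INTENSITY_SETS, key=lambda name: trapezoid(score, *INTENSITY_SETS[name]))
```

For instance, under these assumed breakpoints `intensity_label(0.5)` falls in the core of the "moderate" set, while scores in an overlap region belong partially to two neighbouring labels.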

    A Hybrid Approach to the Sentiment Analysis Problem at the Sentence Level

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. The objective of this article is to present a hybrid approach to the Sentiment Analysis problem at the sentence level. This new method uses essential natural language processing (NLP) techniques, a sentiment lexicon enhanced with the assistance of SentiWordNet, and fuzzy sets to estimate the semantic orientation polarity and its intensity for sentences, which provides a foundation for computing with sentiments. The proposed hybrid method is applied to three different datasets and the results are compared to those obtained using Naïve Bayes and Maximum Entropy techniques. It is demonstrated that the presented hybrid approach is more accurate and precise than both Naïve Bayes and Maximum Entropy techniques when the latter are used in isolation. In addition, it is shown that when applied to datasets containing snippets, the proposed method performs similarly to state-of-the-art techniques

    Consensus in Sentiment Analysis

    No full text
    The objective of this chapter is to present a method applicable in group decision-making where computing the opinion of the majority of participants is key. The method makes use of Induced Ordered Weighted Averaging (IOWA) operators to aggregate a majority opinion out of a number of Sentiment Analysis (SA) classification systems. The numerical output of each SA classification method is used as input to a carefully chosen IOWA operator that is semantically equivalent to the fuzzy linguistic quantifier ‘most of’. The object of the aggregation is the intensity of the previously determined sentence polarity, so that the result represents what the majority thinks

    A Fuzzy Approach to Sentiment Analysis at the Sentence Level

    No full text
    The objective of this chapter is to present a hybrid approach to the Sentiment Analysis problem focused on sentences or snippets. This new method is centred around a sentiment lexicon enhanced with the assistance of SentiWordNet and fuzzy sets to estimate the semantic orientation polarity and intensity for sentences. This provides a foundation for computing with sentiments. The proposed hybrid method is applied to three different datasets and the results are compared to those obtained using Naïve Bayes (NB) and Maximum Entropy (ME) techniques. It is demonstrated through experimentation that this hybrid approach is more accurate and precise than both NB and ME techniques. Furthermore, it is shown that when applied to datasets containing snippets, the proposed method performs similarly to state-of-the-art techniques